Bleu: a Method for Automatic Evaluation of Machine Translation
Authors
Abstract
Human evaluations of machine translation are extensive but expensive. Human evaluations can take months to finish and involve human labor that cannot be reused. We propose a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run. We present this method as an automated understudy to skilled human judges which substitutes for them when there is need for quick or frequent evaluations.
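Concretely, the method scores a candidate translation by its modified n-gram precision against one or more reference translations, combines the precisions geometrically across n-gram orders, and scales the result by a brevity penalty for overly short candidates. The sketch below illustrates that computation; it is a simplified single-sentence version with illustrative names, not the paper's reference implementation.

```python
# A minimal sketch of BLEU's core computation: clipped (modified) n-gram
# precision, a geometric mean over n-gram orders, and a brevity penalty.
# Function and variable names are illustrative, not from the paper.
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, references, n):
    cand_counts = Counter(ngrams(candidate, n))
    # Clip each candidate n-gram count by its maximum count in any reference.
    max_ref = Counter()
    for ref in references:
        for g, c in Counter(ngrams(ref, n)).items():
            max_ref[g] = max(max_ref[g], c)
    clipped = sum(min(c, max_ref[g]) for g, c in cand_counts.items())
    return clipped, sum(cand_counts.values())

def bleu(candidate, references, max_n=4):
    log_prec = 0.0
    for n in range(1, max_n + 1):
        clipped, total = modified_precision(candidate, references, n)
        if clipped == 0:
            return 0.0  # any zero precision makes the geometric mean zero
        log_prec += math.log(clipped / total) / max_n
    # Brevity penalty: penalize candidates shorter than the closest reference.
    c = len(candidate)
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    bp = 1.0 if c >= r else math.exp(1 - r / c)
    return bp * math.exp(log_prec)

cand = "the cat is on the mat".split()
refs = ["the cat is on the mat".split(), "there is a cat on the mat".split()]
print(round(bleu(cand, refs), 3))  # 1.0 for an exact reference match
```

In practice BLEU is computed at the corpus level, pooling clipped counts over all segments before taking precisions, which makes the score more stable than this per-sentence illustration.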
Similar References
Measuring Confidence Intervals for the Machine Translation Evaluation Metrics
Automatic evaluation metrics for Machine Translation (MT) systems, such as BLEU and the related NIST metric, are becoming increasingly important in MT. This paper reports a novel method of calculating the confidence intervals for BLEU/NIST scores using bootstrapping. With this method, we can determine whether two MT systems are significantly different from each other. We study the effect of tes...
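The bootstrap idea the abstract refers to can be sketched as follows: resample test-set segments with replacement, recompute the corpus score on each resample, and read the confidence interval off the empirical percentiles. This is a hedged sketch under assumed names (`segments`, `score_fn` standing in for any corpus-level metric such as BLEU), not the paper's own code.

```python
# Percentile bootstrap confidence interval for a corpus-level MT metric.
import random

def bootstrap_ci(segments, score_fn, n_resamples=1000, alpha=0.05, seed=0):
    rng = random.Random(seed)
    scores = []
    for _ in range(n_resamples):
        # Resample the test set with replacement and rescore it.
        sample = [rng.choice(segments) for _ in segments]
        scores.append(score_fn(sample))
    scores.sort()
    lo = scores[int((alpha / 2) * n_resamples)]
    hi = scores[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi
```

To compare two systems, the same resamples can be scored under both systems and the interval taken over the score differences; two systems whose difference interval excludes zero differ significantly.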
Automatic Evaluation for a Palpable Measure of a Speech Translation System's Capability
The main goal of this paper is to propose automatic schemes for the translation paired comparison method. This method was proposed to precisely evaluate a speech translation system's capability. Furthermore, the method gives an objective evaluation result, i.e., a score of the Test of English for International Communication (TOEIC). The TOEIC score is used as a measure of one's speech translati...
The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language
Machine Translation Evaluation Metrics (MTEMs) are the central core of Machine Translation (MT) engines as they are developed based on frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages is still under question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...
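The validity check such a study performs can be illustrated in a few lines: collect a metric score and a human judgment per system (or per segment) and compute their correlation. The numbers below are placeholders, not figures from the study; `statistics.correlation` requires Python 3.10+.

```python
# Correlate metric scores with human judgments (Pearson's r).
from statistics import correlation  # Python 3.10+

metric_scores = [0.21, 0.34, 0.28, 0.40]  # hypothetical BLEU per system
human_scores = [2.9, 3.8, 3.1, 4.2]       # hypothetical adequacy ratings
print(correlation(metric_scores, human_scores))
```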
Feedback Cleaning of Machine Translation Rules Using Automatic Evaluation
When rules of transfer-based machine translation (MT) are automatically acquired from bilingual corpora, incorrect/redundant rules are generated due to acquisition errors or translation variety in the corpora. As a new countermeasure to this problem, we propose a feedback cleaning method using automatic evaluation of MT quality, which removes incorrect/redundant rules as a way to increase the e...
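One simple realization of this feedback loop is a greedy filter: drop any rule whose removal does not hurt (or improves) an automatic evaluation score on a development set. The sketch below assumes hypothetical `translate` and `evaluate` callables standing in for the MT engine and the metric; the paper's actual procedure may differ.

```python
# Greedy feedback cleaning: remove rules that are redundant or harmful
# according to an automatic evaluation score on a development set.
def feedback_clean(rules, dev_set, translate, evaluate):
    baseline = evaluate(translate(dev_set, rules))
    for rule in list(rules):  # iterate over a snapshot of the rule set
        trial = [r for r in rules if r is not rule]
        score = evaluate(translate(dev_set, trial))
        if score >= baseline:  # removal did not hurt: keep it removed
            rules, baseline = trial, score
    return rules
```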
Interpreting BLEU/NIST Scores: How Much Improvement do We Need to Have a Better System?
Automatic evaluation metrics for Machine Translation (MT) systems, such as BLEU and the related NIST metric, are becoming increasingly important in MT. Yet, their behaviors are not fully understood. In this paper, we analyze some flaws in the BLEU/NIST metrics. With a better understanding of these problems, we can better interpret the reported BLEU/NIST scores. In addition, this paper reports a...
Journal:
Volume/Issue:
Pages: -
Publication date: 2002